Lecture 3 (Part 1)



Topics covered: 
CPU Architecture 



Fetch/execute cycle of an instruction 

□ Step I: 

♦ Fetch the contents of the memory location pointed to by 
Program Counter (PC). 

♦ PC points to the memory location which has the instruction to 
be executed. 

♦ Load the contents of the memory location into Instruction 
Register (IR). 

□ Step II: 

♦ Increment the contents of the PC by 4 (assuming the memory is 
byte addressable and the word length is 32 bits). 

□ Step III: 

♦ Carry out the operation specified by the instruction in the IR.

□ Steps I and II constitute the fetch phase, and are repeated 
as many times as necessary to fetch the complete 
instruction. 

□ Step III constitutes the execution phase. 
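
The three steps above can be sketched as a small loop. This is only an illustrative model: the two-field instruction format, the opcodes ADDI and HALT, and the accumulator are assumptions made for the example and are not part of the lecture.

```python
# Toy model of the fetch/execute cycle; the instruction encoding is invented.
WORD_BYTES = 4  # 32-bit words in a byte-addressable memory


def run(memory, pc=0):
    """Run until a HALT instruction; return (final PC, accumulator)."""
    acc = 0
    while True:
        ir = memory[pc]        # Step I: fetch the word PC points to into IR
        pc += WORD_BYTES       # Step II: advance PC to the next instruction
        op, operand = ir       # Step III: carry out the operation held in IR
        if op == "HALT":
            return pc, acc
        if op == "ADDI":
            acc += operand
        else:
            raise ValueError(f"unknown opcode {op!r}")


program = {0: ("ADDI", 5), 4: ("ADDI", 7), 8: ("HALT", 0)}
print(run(program))  # (12, 12)
```

Note that the PC has already moved past the instruction by the time it executes, which is why the final PC is 12 rather than 8.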






Internal organization of a processor 



□ Recall that a processor has several registers/building 
blocks: 

♦ Memory address register (MAR) 

♦ Memory data register (MDR) 

♦ Program Counter (PC) 

♦ Instruction Register (IR) 

♦ General purpose registers R0 - R(n-1)

♦ Arithmetic and logic unit (ALU) 

♦ Control unit. 

□ How are these units organized and how do they communicate 
with each other? 

Single bus organization

□ Single bus organization: 

♦ ALU, control unit and all the registers are connected via a 
single common bus. 

♦ Bus is internal to the processor and should not be confused 
with the external bus that connects the processor to the 
memory and I/O devices. 

□ Data lines of the external memory bus are connected to the 
internal processor bus via MDR. 

♦ Register MDR has two inputs and two outputs. 

♦ Data may be loaded into MDR from either the internal processor
bus or the external memory bus, and MDR may place its contents
onto either bus.

□ Address lines of the external memory bus are connected to 
the internal processor bus via MAR. 

♦ MAR receives input from the internal processor bus. 

♦ MAR provides output to external memory bus. 

Single bus organization (contd.)



□ The instruction decoder and control logic block (the control unit)
issues signals that control the operation of all units inside the
processor and its interaction with the memory bus.

♦ Control signals depend on the instruction loaded in the 
Instruction Register (IR) 

□ Outputs from the control logic block are connected to: 

♦ Control lines of the memory bus. 

♦ ALU, to determine which operation is to be performed. 

♦ Select input of the multiplexer MUX to select between 
Register Y and constant 4. 

♦ Control lines of the registers, to select the registers. 

Single bus organization (contd.)



□ Registers Y, Z, and TEMP: 

♦ Used by the processor for temporary storage during execution 
of some instructions. 

♦ Note that registers R0 to R(n-1) are used to store data
generated by one instruction for later use by another
instruction.

♦ Data is stored in R0 through R(n-1) after the execution of an
instruction.

□ Multiplexer MUX selects either the output of register Y or 
a constant 4, depending upon the control input Select. 

♦ Constant 4 is used to increment the value of the PC. 

Registers and the bus



□ A bus may be viewed as a collection of parallel wires. 

□ Buses have no memory: 

♦ They are just a collection of wires. 

□ When data is on the bus, all registers can "see" that data at 
their inputs. 

□ A register may place its contents onto the bus. 

Registers and the bus (contd.)



□ At any one time, only one register may output its contents to 
the bus: 

♦ Which register outputs its content to the bus is determined by 
the control signal issued by the control logic. 

♦ Control signal depends on the instruction loaded in the 
instruction register IR. 

□ Registers can load data from the bus: 

♦ Which registers load data from the bus is determined by the 
control signal issued by the control logic. 

□ Registers are clocked (sequential) entities (unlike the ALU, which
is purely combinational).

Registers and the bus (contd.)

Registers are connected to the bus via switches controlled by the
signals R_in and R_out.

Each register Ri has two control signals, Ri_in and Ri_out.

If Ri_in = 1, the data from the bus is loaded into the register.

If Ri_out = 1, the data from the register is loaded onto the bus.

The same holds for registers Y and Z as well.

• Each bit in a register may be implemented by an edge-triggered D flip-flop.

• A two-input multiplexer is used to select the data applied to the input of the
edge-triggered flip-flop.

• The Q output of the flip-flop is connected to the bus via a tri-state gate.

Ri_in = 1:

Multiplexer selects the data on the bus.
Data is loaded into the flip-flop at the rising edge of the clock.

Ri_in = 0:

Multiplexer feeds back the value currently stored in the flip-flop.
Q output represents the value currently stored in the flip-flop.
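
The multiplexer-plus-flip-flop arrangement for one register bit can be modelled in a few lines. This is a behavioural sketch only; the class name RegisterBit and the method rising_edge are hypothetical names chosen for the example.

```python
class RegisterBit:
    """One register bit: a 2-input multiplexer feeding an edge-triggered D flip-flop."""

    def __init__(self):
        self.q = 0  # value currently stored in the flip-flop

    def rising_edge(self, bus_bit, r_in):
        # The multiplexer picks the bus when r_in is 1, otherwise it feeds Q back.
        d = bus_bit if r_in else self.q
        self.q = d  # the flip-flop captures D on the rising clock edge
        return self.q


bit = RegisterBit()
bit.rising_edge(bus_bit=1, r_in=1)  # Ri_in = 1: loads 1 from the bus
bit.rising_edge(bus_bit=0, r_in=0)  # Ri_in = 0: holds, the bus is ignored
print(bit.q)  # 1
```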

Ri_out = 1:

Tri-state gate loads the value of the flip-flop onto the bus.
Data is loaded onto the bus at the rising edge of the clock.

Ri_out = 0:

Gate's output is in the high-impedance (electrically disconnected) state.
Corresponds to the open-circuit state.

Registers and the bus (contd.)

Operation of a tri-state gate

• A tri-state gate can enter one of three output states:

- its output can be in a logic low state (L).

- its output can be in a logic high state (H).

- its output can be effectively an open circuit (high impedance).

• When a tri-state gate connected to a bus is in the high-impedance state, its
output is effectively disconnected from the bus.

Ri_out = 1, output is:
- Logic low, if Q = 0
- Logic high, if Q = 1

Ri_out = 0, output is:
- High impedance (open-circuit condition)
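
The three output states can be tabulated with a short function. This is a sketch of the gate's logical behaviour only; the function name tri_state and the letter "Z" for high impedance are conventions assumed for the example.

```python
HIGH_Z = "Z"  # stands for the high-impedance (disconnected) state


def tri_state(q, r_out):
    """Return the gate output for stored bit q and enable signal r_out."""
    if r_out == 1:
        return "H" if q == 1 else "L"
    return HIGH_Z


for r_out in (1, 0):
    for q in (0, 1):
        print(f"Ri_out={r_out} Q={q} -> {tri_state(q, r_out)}")
```

With the enable at 0 the output is "Z" regardless of Q, which is what lets many registers share one bus.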

Registers and the bus (contd.)

Operation of an edge-triggered flip-flop

• Data is loaded from the register to the bus (or to the register from the bus)
at the rising edge of the clock.

• That is, data is loaded at the low-to-high (L-H) transition of the clock.

Registers and the bus (contd.)



□ Data transfers and operations take place within time periods 
defined by the processor clock. 

♦ Time period is known as the clock cycle. 

□ At the beginning of the clock cycle, the control signals that 
govern a particular transfer are asserted. 

♦ For example, if data is to be transferred from register R0 to
the bus, then R0_out is set to 1.

□ Edge-triggered flip-flop operation explained earlier used 
only the rising edge of the clock for data transfer. 

♦ Other schemes are possible, for example, data transfers may 
use rising and falling edges of the clock. 

□ When edge-triggered flip-flops are not used, two or more
clock signals may be needed to guarantee proper transfer of
data. This is known as multiphase clocking.

Simple register transfer example

Transfer the contents of register R3 to register R4

1. Control signals R3_out and R4_in become 1. They stay valid until the end of
the clock cycle.

2. After a small delay, the contents of R3 are placed onto the bus. The contents
of R3 stay on the bus until the end of the clock cycle.

3. At the end of the clock cycle, the data on the bus is loaded into R4. R3_out
and R4_in become 0.

Loading multiple registers from the bus

Transfer the contents of register R3 to registers R4 and R5

1. Control signals R3_out, R4_in and R5_in become 1. They stay valid until the end of
the clock cycle.

2. After a small delay, the contents of R3 are placed onto the bus. The contents
of R3 stay on the bus until the end of the clock cycle.

3. At the end of the clock cycle, the data on the bus is loaded into R4 and R5.
R3_out, R4_in and R5_in become 0.

Loading multiple registers from the bus (contd.)



□ It is possible to load multiple registers simultaneously from 
the bus. 

♦ For example, transfer the contents of register R3 to registers R4
and R7 simultaneously.

□ The number of registers that can be simultaneously loaded 
depends on: 

♦ Drive capability (fan-out) 

♦ Noise. 

♦ Note that this is an electrical issue, not a logical issue. 

□ Distinguish this from multiple registers loading the bus: 

♦ For example, load the contents of registers R3 and R4 onto the bus
simultaneously.

♦ Logically inconsistent event. 

♦ Physically dangerous event. 
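
The difference between several registers loading from the bus and several registers driving it can be checked mechanically. The sketch below models one clock cycle from a set of asserted control signals; the function name clock_cycle and the signal spelling (R3_out, R4_in) are assumptions for the example, and electrical limits such as fan-out are not modelled.

```python
def clock_cycle(regs, asserted):
    """Apply one clock cycle given asserted signals such as {'R3_out', 'R4_in'}."""
    drivers = sorted(s[:-4] for s in asserted if s.endswith("_out"))
    loaders = sorted(s[:-3] for s in asserted if s.endswith("_in"))
    if len(drivers) > 1:
        raise RuntimeError(f"bus contention between {drivers}")
    if not drivers:
        raise RuntimeError("no register is driving the bus")
    bus = regs[drivers[0]]   # the single source places its contents on the bus
    for name in loaders:     # every selected destination loads at the end of the cycle
        regs[name] = bus
    return regs


regs = {"R3": 7, "R4": 0, "R7": 0}
clock_cycle(regs, {"R3_out", "R4_in", "R7_in"})
print(regs)  # {'R3': 7, 'R4': 7, 'R7': 7}

try:
    clock_cycle(regs, {"R3_out", "R4_out"})
except RuntimeError as err:
    print(err)  # bus contention between ['R3', 'R4']
```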

Arithmetic Logic Unit (ALU)

□ ALU is a purely combinational device:

♦ It has no memory or internal storage. 

□ It has 2 input vectors: 

♦ These may be called the A- and B-vector or the R- and S-vector 

♦ The inputs are as wide as the registers/system bus (e.g., 16 or 32
bits)

□ It has 1 output vector 

♦ Usually denoted F 

Arithmetic Logic Unit (ALU) (contd.)

Sample functions performed by the ALU

• F = A + B            F = A + B + 1
• F = A - B            F = A - B - 1
• F = A and B          F = A or B
• F = not A            F = not B
• F = not A + 1        F = not B + 1
• F = (not A) and B    F = A and (not B)
• F = A xor B          F = not (A xor B)
• F = A                F = B
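
A subset of these functions can be written as a lookup table of pure functions, which mirrors the fact that the ALU has no internal state. The 32-bit width, the table name ALU_OPS and the string keys are assumptions made for this sketch.

```python
MASK = 0xFFFFFFFF  # assume 32-bit operands

ALU_OPS = {
    "A+B": lambda a, b: a + b,
    "A+B+1": lambda a, b: a + b + 1,
    "A-B": lambda a, b: a - b,
    "A-B-1": lambda a, b: a - b - 1,
    "A and B": lambda a, b: a & b,
    "A or B": lambda a, b: a | b,
    "not A": lambda a, b: ~a,
    "not A + 1": lambda a, b: ~a + 1,  # two's-complement negation of A
    "A xor B": lambda a, b: a ^ b,
    "A": lambda a, b: a,
    "B": lambda a, b: b,
}


def alu(op, a, b):
    """Combinational: the output F depends only on the current inputs."""
    return ALU_OPS[op](a, b) & MASK


print(hex(alu("A-B", 3, 5)))        # 0xfffffffe
print(hex(alu("not A + 1", 1, 0)))  # 0xffffffff
```

Masking the result to 32 bits is what makes subtraction wrap around the way a fixed-width datapath would.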

Arithmetic and Logic Unit (ALU) (contd.)



ALU connections to the bus 




• ALU must have only one input connection
from the bus.

• The other input must be stored in a holding
register called the Y register.

• A multiplexer selects between register Y and the constant 4,
depending on the Select line.

• One operand of a two-operand instruction must
be placed into the Y register before the other
operand is placed onto the bus.

• Identical reasoning tells us that there must
be an output register Z which collects the
output of the ALU at the end of each cycle.

• This way, there can be:
- one operand in the Y register
- one operand on the bus
- the result stored in the Z register

Performing an arithmetic operation

Add the contents of registers R1 and R2 and place the result in R3.

That is: R3 = R1 + R2

1. Place the contents of register R1 into the Y register in the first
clock cycle.

2. Place the contents of register R2 onto the bus in the second
clock cycle.

Both inputs to the ALU are now valid. Select register Y and
assert the ALU command F = A + B.

3. In the third clock cycle, the Z register has latched the output of the
ALU. Thus the contents of the Z register can be copied into
register R3.

Performing an arithmetic operation (contd.)

Clock Cycle 1:

R1_out, Y_in (Y = R1)

Clock Cycle 2:

R2_out, SelectY, Add, Z_in (Z = R1 + R2)

Clock Cycle 3:

Z_out, R3_in (R3 = Z)
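
The three clock cycles can be traced by hand in code. This is a minimal sketch assuming 32-bit registers held in a dictionary; the function name add_registers is hypothetical, and each comment names the control signals that would be asserted in that cycle.

```python
def add_registers(regs):
    """R3 = R1 + R2 over three clock cycles on a single internal bus."""
    # Cycle 1: R1_out, Y_in
    bus = regs["R1"]
    regs["Y"] = bus
    # Cycle 2: R2_out, SelectY, Add, Z_in (ALU input A = Y, input B = bus)
    bus = regs["R2"]
    regs["Z"] = (regs["Y"] + bus) & 0xFFFFFFFF
    # Cycle 3: Z_out, R3_in
    bus = regs["Z"]
    regs["R3"] = bus
    return regs


regs = add_registers({"R1": 10, "R2": 32, "R3": 0, "Y": 0, "Z": 0})
print(regs["R3"])  # 42
```

The variable bus is overwritten in every cycle, reflecting that the bus itself has no memory.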

Performing an arithmetic operation (contd.)



□ Inputs of the ALU: 

♦ Input B is tied to the bus. 

♦ Input A is tied to the output of the multiplexer. 

□ Output of the ALU: 

♦ Tied to the input of the Z register. 

□ Z register: 

♦ Input tied to the output of the ALU. 

♦ Output tied to the bus. 

♦ Unlike Ri_in, Z_in loads data from the output of the ALU and not
the bus.

ALU operations

□ RC = RA op RB

□ Clock cycle 1:

♦ Move RA to the Y register.

♦ RA_out, Y_in

□ Clock cycle 2:

♦ Put RB on the bus, perform F = RA op RB, and transfer the
result to Z.

♦ RB_out, SelectY, (RA op RB), Z_in

□ Clock cycle 3:

♦ Put Z on the bus, and load Z into RC.

♦ Z_out, RC_in



Fetching a word from memory

□ Processor has to specify the address of the memory location 
where this information is stored and request a Read 
operation. 

□ Processor transfers the required address to MAR. 

♦ Output of MAR is connected to the address lines of the 
memory bus. 

□ Processor uses the control lines of the memory bus to 
indicate that a Read operation is needed. 

□ The requested information is received from the memory and
is stored in MDR.

♦ From MDR it is then transferred to other registers.

Fetching a word from memory (contd.)

Figure: Connections for register MDR.



□ Timing of the internal processor operations must be 
coordinated with the response time of memory Read 
operations. 

□ Processor completes one internal data transfer in one clock 
cycle. 

□ Memory response time for a Read operation is variable and 
usually longer than one clock cycle. 

♦ Processor waits until it receives an indication that the 
requested Read has been completed. 

♦ Control signal Memory Function Completed (MFC) is used for 
this purpose. 

♦ MFC is set to 1 by the memory to indicate that the contents of 
the specified location have been read and are available on the 
data lines of the memory bus. 

Fetching a word from memory (contd.)

MOVE (R1), R2

1. Load the contents of Register R1 into MAR.

2. Start a Read operation on the memory bus.

3. Wait for MFC response from the memory.

4. Load MDR from the memory bus.

5. Load the contents of MDR into Register R2.

Steps can be performed separately; some may be combined.

1. Steps 1 and 2 can be combined.

- Load R1 into MAR and activate the Read control signal simultaneously.

2. Steps 3 and 4 can be combined.

- Activate control signal MDR_inE while waiting for the MFC response from
the memory.

3. Last step loads the contents of MDR into Register R2.

Hence, the memory Read operation takes 3 steps.

Fetching a word from memory (contd.)

MOVE (R1), R2: Memory operation takes 3 steps.

Step 1:

- Place R1 onto the internal processor bus.

- Load the contents of the bus into MAR.

- Activate the Read control signal.

- R1_out, MAR_in, Read

Step 2:

- Wait for MFC from the memory.

- Activate the control signal to load data from external bus to MDR.

- MDR_inE, WMFC

Step 3:

- Place the contents of MDR onto the internal processor bus.

- Load the contents of the bus into Register R2.

- MDR_out, R2_in
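
The role of MFC is easiest to see when the memory takes a variable number of cycles to respond. The sketch below is a toy model: the Memory class, its latency parameter and the way cycles are counted are all assumptions for the example, not a description of any real memory interface.

```python
class Memory:
    """Toy memory whose Read takes `latency` cycles before MFC becomes 1."""

    def __init__(self, contents, latency=3):
        self.contents, self.latency = contents, latency
        self.remaining, self.address = 0, None

    def start_read(self, address):
        self.address, self.remaining = address, self.latency

    def tick(self):
        """Advance one clock cycle; return (MFC, value on the data lines)."""
        self.remaining -= 1
        if self.remaining <= 0:
            return 1, self.contents[self.address]
        return 0, None


def move_indirect(regs, memory):
    """MOVE (R1), R2 in three control steps; returns the clock cycles used."""
    regs["MAR"] = regs["R1"]        # Step 1: R1_out, MAR_in, Read
    memory.start_read(regs["MAR"])
    cycles = 1
    mfc = 0
    while not mfc:                  # Step 2: MDR_inE, WMFC (may span many cycles)
        mfc, data = memory.tick()
        cycles += 1
    regs["MDR"] = data
    regs["R2"] = regs["MDR"]        # Step 3: MDR_out, R2_in
    return cycles + 1


regs = {"R1": 0x100, "R2": 0, "MAR": 0, "MDR": 0}
print(move_indirect(regs, Memory({0x100: 99})), regs["R2"])  # 5 99
```

Three control steps take five clock cycles here because step 2 stretches until the memory asserts MFC.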

Storing a word into memory

MOVE R2, (R1): Memory operation takes 3 steps.

Step 1:

- Place R1 onto the internal processor bus.

- Load the contents of the internal processor bus into MAR.

- R1_out, MAR_in

Step 2:

- Place R2 onto the internal processor bus.

- Load the contents of the internal processor bus into MDR.

- Activate Write operation.

- R2_out, MDR_in, Write

Step 3:

- Place the contents of MDR onto the external memory bus.

- Wait for MFC, which indicates that the memory write operation has completed.

- MDR_outE, WMFC
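
The store sequence mirrors the read sequence with the data flowing the other way. A minimal sketch, assuming the memory completes the write immediately (so the wait for MFC is not modelled) and using a hypothetical function name store_indirect:

```python
def store_indirect(regs, memory):
    """MOVE R2, (R1): write R2 to the memory location addressed by R1."""
    regs["MAR"] = regs["R1"]             # Step 1: R1_out, MAR_in
    regs["MDR"] = regs["R2"]             # Step 2: R2_out, MDR_in, Write
    memory[regs["MAR"]] = regs["MDR"]    # Step 3: MDR_outE, WMFC
    return memory


memory = store_indirect({"R1": 0x200, "R2": 5, "MAR": 0, "MDR": 0}, {})
print(memory)  # {512: 5}
```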

Execution of a complete instruction

Add the contents of a memory location pointed to by Register R3
to register R1.

ADD (R3), R1

To execute the instruction we must execute the following tasks:

1. Fetch the instruction.

2. Fetch the operand (contents of the memory location pointed to by R3).

3. Perform the addition.

4. Load the result into R1.

Execution of a complete instruction (contd.)

Task 1: Fetch the instruction

Recall that:

- PC holds the address of the memory location which has the next
instruction to be executed.

- IR holds the instruction currently being executed.

Step 1

- Load the contents of PC into MAR.

- Activate the Read control signal.

- Increment the contents of the PC by 4.

- PC_out, MAR_in, Read, Select4, Add, Z_in

Step 2

- Update the contents of the PC.

- Copy the updated PC to Register Y (useful for Branch instructions).

- Activate the control signal to load data from external bus to MDR.

- Wait for MFC from memory.

- Z_out, PC_in, Y_in, MDR_inE, WMFC

Step 3

- Place the contents of MDR onto the bus.

- Load the IR with the contents of the bus.

- MDR_out, IR_in

Execution of a complete instruction (contd.)

Task 2. Fetch the operand (contents of memory pointed to by R3).
Task 3. Perform the addition.
Task 4. Load the result into R1.

Step 4:

- Place the contents of Register R3 onto the internal processor bus.

- Load the contents of the bus into MAR.

- Activate the Read control signal.

- R3_out, MAR_in, Read

Step 5:

- Place the contents of R1 onto the bus.

- Load the contents of the bus into Register Y (recall: one operand in Y).

- Wait for MFC.

- R1_out, Y_in, MDR_inE, WMFC

Step 6:

- Load the contents of MDR onto the internal processor bus.

- Select Y and perform the addition.

- Place the result in Z.

- MDR_out, SelectY, Add, Z_in

Step 7:

- Place the contents of Register Z onto the internal processor bus.

- Place the contents of the bus into Register R1.

- Z_out, R1_in

Execution of a complete instruction (contd.)

ADD (R3), R1

Step  Action

1     PC_out, MAR_in, Read, Select4, Add, Z_in

2     Z_out, PC_in, Y_in, WMFC

3     MDR_out, IR_in

4     R3_out, MAR_in, Read

5     R1_out, Y_in, WMFC

6     MDR_out, SelectY, Add, Z_in

7     Z_out, R1_in, End
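
The seven-step control sequence can be fed to a small interpreter to check that it really leaves R1 + [R3] in R1 and PC + 4 in PC. This is a sketch under simplifying assumptions: memory answers by the time WMFC is reached (so loading MDR is folded into WMFC), registers are 32 bits wide, and the instruction word stored at address 0 is an arbitrary placeholder. The names run_sequence and ADD_R3_R1 are hypothetical.

```python
MASK = 0xFFFFFFFF  # assume 32-bit registers

ADD_R3_R1 = [
    "PC_out, MAR_in, Read, Select4, Add, Z_in",
    "Z_out, PC_in, Y_in, WMFC",
    "MDR_out, IR_in",
    "R3_out, MAR_in, Read",
    "R1_out, Y_in, WMFC",
    "MDR_out, SelectY, Add, Z_in",
    "Z_out, R1_in, End",
]


def run_sequence(steps, regs, memory):
    """Execute control steps on a single-bus datapath, one step per iteration."""
    pending = None  # address of the Read currently in progress
    for step in steps:
        signals = step.replace(",", " ").split()
        drivers = [s[:-4] for s in signals if s.endswith("_out")]
        if len(drivers) != 1:
            raise RuntimeError("exactly one register must drive the bus")
        bus = regs[drivers[0]]
        if "Add" in signals:
            # MUX picks the constant 4 or register Y as ALU input A; B is the bus.
            a = 4 if "Select4" in signals else regs["Y"]
            alu_out = (a + bus) & MASK
        for s in signals:
            if s == "Z_in":
                regs["Z"] = alu_out       # Z loads from the ALU, not the bus
            elif s.endswith("_in"):
                regs[s[:-3]] = bus        # other registers load from the bus
        if "Read" in signals:
            pending = regs["MAR"]
        if "WMFC" in signals:
            regs["MDR"] = memory[pending]  # data arrives from the external bus
    return regs


memory = {0x0: 0x0ADD31, 0x100: 32}  # placeholder instruction word, then the operand
regs = {"PC": 0, "IR": 0, "MAR": 0, "MDR": 0, "Y": 0, "Z": 0, "R1": 10, "R3": 0x100}
run_sequence(ADD_R3_R1, regs, memory)
print(regs["R1"], regs["PC"])  # 42 4
```

Step 2 also copies the updated PC into Y; the ADD instruction overwrites Y in step 5 and never uses that copy.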

Branch instructions

□ Recall that the updated contents of the PC are copied into
Register Y in Step 2.

♦ Not necessary for the ADD instruction, but useful in BRANCH
instructions.

♦ Branch target address is computed by adding the updated
contents of the PC to an offset.

□ Copying the updated contents of the PC to Register Y
speeds up the execution of BRANCH instructions.

□ Since the Fetch cycle is the same for all instructions, this
step is performed for all instructions.

♦ Since Register Y is not used for any other purpose at that time,
it does not have any impact on the execution of the instruction.

Unconditional Branch instructions

Step  Action

1     PC_out, MAR_in, Read, Select4, Add, Z_in

2     Z_out, PC_in, Y_in, WMFC

3     MDR_out, IR_in

4     Offset-field-of-IR_out, SelectY, Add, Z_in

5     Z_out, PC_in, End

Figure 7.7 Control sequence for an unconditional Branch instruction.

Conditional Branch instructions

Consider now a conditional branch. In this case, we need to check the status of the
condition codes before loading a new value into the PC. For example, for a
Branch-on-negative (Branch<0) instruction, step 4 in Figure 7.7 is replaced with

Offset-field-of-IR_out, SelectY, Add, Z_in, If N = 0 then End

Thus, if N = 0 the processor returns to step 1 immediately after step 4. If N = 1,
step 5 is performed to load a new value into the PC, thus performing the branch
operation.
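
The effect of the N check on the next PC value can be written out directly. This is a sketch of steps 4 and 5 only, assuming the fetch steps have already left the updated PC in register Y and that addresses are 32 bits wide; the function name branch_if_negative is hypothetical.

```python
def branch_if_negative(pc_updated, offset, n_flag):
    """Return the next PC after steps 4 and 5 of a Branch<0 instruction."""
    y = pc_updated                   # Y was loaded with the updated PC in step 2
    z = (y + offset) & 0xFFFFFFFF    # step 4: Offset-field-of-IR_out, SelectY, Add, Z_in
    if n_flag == 0:
        return pc_updated            # "If N = 0 then End": PC keeps the updated value
    return z                         # step 5: Z_out, PC_in, End


print(branch_if_negative(0x104, 0x20, n_flag=1))  # 292
print(branch_if_negative(0x104, 0x20, n_flag=0))  # 260
```

The target is computed in step 4 whether or not the branch is taken; the condition only decides whether step 5 copies it into the PC.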